Abstract. A square filled with horizontal stripes is perceived as thinner than one with vertical stripes 
(Helmholtz illusion). This is not consistent with a common belief that horizontally striped clothing 
makes a person look fatter, and studies on this problem have shown inconsistent results. Here, 
we demonstrate three factors that could have complicated the issue. First, the Helmholtz effect is 
stronger for a thin figure than for a fat one, with possible reversal for the latter. Second, we found 
large variability across participants, suggesting dependence on which features are attended to. Third, there was 
strong hysteresis as to the order of testing fat and thin figures, suggesting the effect of surrounding 
people in daily life. There can be yet other factors, but we should note that this apparently simple 
case of application of a geometrical illusion in daily perception should be taken as a rather complex 
phenomenon. 
1 Results and discussion 

It is commonly believed that horizontal stripes on clothing make the wearer look fatter. This, however, 
conflicts with a known geometrical illusion found by Helmholtz (1867); namely, a square filled with 
horizontal stripes looks taller and a square filled with vertical stripes looks wider. He mentioned the 
same effect on ladies' frocks. Thompson and Mikellidou (2011) examined the effects of stripes on 
clothing in a series of experiments and concluded that Helmholtz was correct. However, in 2012, an 
amateur scientist in the United Kingdom, Val Watham, won the prize in a BBC radio program by 
demonstrating, under the supervision of Thompson himself, that the common belief holds in an 
experiment in which participants watched videos of people walking in horizontally or vertically 
striped clothes (http://www.bbc.co.uk/radio4/features/sywtbas/finalists/stripes/). We may conclude 
that the Helmholtz illusion could be overridden by other factors in a realistic scene. There could be, 
however, a number of other factors that may also affect the results. Here, we demonstrate some such 
factors, on both the stimulus and the observer sides. 

One can notice that the main female figure used in Thompson and Mikellidou (2011) is relatively 
thin and tall (Figure 1(a)). This contrasts with the fat men in the figure of Imai (1982), shown in 
Figure 1(b), with which he supported the common belief; the man with vertical stripes was judged 
thinner than the one with horizontal stripes by 15.37% (137 participants, SD = 9.24%). 
This large difference has led us to the possibility that the effect of stripes might depend on the shape 
of the person, which is also suggested by Experiment 3 in Thompson and Mikellidou (2011), where 
they found that the effect of stripes on simple cylinders faded as the cylinders became fatter. To test 
this idea, we conducted an experiment in which we compared the points of subjective equality (PSEs) 
in terms of the body width (i.e. when figures with horizontal and vertical stripes look equally fat) for 
relatively thin and fat figures (Figure 2). 

Figure 1. (a) The typical visual stimuli used in Thompson and Mikellidou (2011).^ (b) A demonstration of 
fattening by horizontal stripes in Imai (1982). 

The results are summarized in Figure 3, where a positive PSE supports Helmholtz (a person 
with horizontal stripes appeared thinner than one with vertical stripes), whereas a negative PSE 
supports the common belief (a person with horizontal stripes appeared fatter). This figure highlights 
three major aspects of the results that demonstrate the complexity underlying this phenomenon. First, 
the overall effect is positive and in accordance with the Helmholtz illusion. Second, the thin body 
yielded a significantly larger Helmholtz illusion than the fat body (F(1, 26) = 6.513, p = 0.0165, by a 
three-way mixed-design analysis of variance for block order × gender × body type). This result is 
consistent with Experiment 3 of Thompson and Mikellidou (2011) using cylinders, and supports our 
hypothesis. Third, the block order had a significant effect (fat-first vs. thin-first; F(1, 26) = 4.262, 
p = 0.0491). The effect of participants' gender was not significant (F(1, 26) = 1.439, p = 0.241), but 
note that gender difference was not our primary interest and the numbers of participants were not 
matched. Also, there was no significant two-way or three-way interaction (p > 0.2 for all). 

There seems to be a strong hysteresis effect, because the results in the second block are pulled 
towards those in the first block. The reason for this effect is unknown, but it may reflect response 
biases learned through the first block. We therefore separately analysed the results of the first block 
(block 1), which should reflect more naive responses of each participant (between-participant 
comparisons). In block 1, the difference between the thin and the fat figures is more pronounced 
(t(28) = 4.526, p = 0.00047). The averaged PSE was positive for the thin figure (t(14) = 3.916, 
p = 0.00056), but it was negative for the fat figure, although it was not significantly below zero 
(t(14) = 0.911, p = 0.258). 

Figure 2. Stimuli used in our experiment: (a) a pair of thin figures and (b) a pair of fat figures. 

^The figure was not reproduced correctly in the published paper, and this was copied from the original 
(R Thompson, personal communication). 

Figure 3. Mean PSEs for the two body conditions: all averaged, block 1, and block 2. Bars show ±1 SEM across 
participants. 

There is another factor that has received little attention in the literature, namely, very large 
variability across participants. As seen in Figure 4, PSEs of individual participants in block 1 are 
broadly distributed, spreading from negative to positive for both thin and fat figures. This could be 
partly because we did not instruct participants to judge on the basis of a particular criterion. Giving 
a more precise instruction could have reduced the variability, but we preferred to see more natural 
judgements as people would make in daily life. 

In summary, we have identified three factors that may underlie the inconsistency over the effects 
of stripes on clothing. First, we provided striking evidence that the effect of stripes depends on the 
body shape. Outfits with horizontal stripes can make a slender person look even fitter, but the effect 
is not ensured or could be reversed for others (consistent with the common belief). This suggests 
that one practical way to make the Helmholtz illusion more effective would be to wear a long 
dress as in Figure 1(a), which might be less useful for men. The reason is yet to be investigated, but it 
might be a general sensory or perceptual effect, as the result parallels the finding that the Oppel-Kundt 
illusion (overestimation of an interval filled with vertical lines) became weaker when the filling lines 
were longer than a certain maximum (Wackermann & Kastner, 2010). Second, the effect can be widely 
variable across people, making it of even less practical use. Third, the strong effect of order may imply 
that the effect of striped outfits could depend on the clothing and/or fitness of surrounding people. 

Figure 4. A histogram that shows the distributions of individual PSEs in block 1. 

We still have not solved the entire problem, as there are other potential factors that could have 
shaped the results in the Watham experiment. First, the effect of 3D cues with the vertical stripes 
(Taya & Miura, 2007) may have been pronounced in videos (kinetic depth effect). Second, a direct 
comparison between two figures, also adopted by Thompson and Mikellidou (2011), may emphasize 
geometrical measures compared with the case of rating one person at a time, although our 
preliminary study did not support this idea. Finally, we could speculate on an intriguing possibility 
that motion is crucial; horizontal stripes would yield weaker motion signals than vertical 
stripes when people walk around upright. In such a case, we would expect more blurring with 
horizontal stripes, while more effective motion deblurring could operate with vertical stripes, which 
would lead to faster (stronger) motion signals (e.g. Castet, Lorenceau, Shiffrar, & Bonnet, 1993). This 
could result in relative fattening of a person with horizontally striped clothing. These possibilities 
remain speculative, but we should consider multiple factors and individual differences, which leaves 
us with further interesting questions for future studies. 

2 Methods 

Thirty-one undergraduate students participated for course credit. One of them was omitted from the 
analysis because the overall psychometric function lay below 0.4 and the estimation of PSEs was 
unreliable. The results from the remaining 30 (19 females) were analysed. The participants observed a 
computer screen (Mitsubishi 23-inch or Apple 24-inch Cinema Screen LCDs) from a distance of 48 cm 
using a chin rest. They were naïve to the exact purpose of the experiment, and were given detailed 
explanations later during the class. 

Static computer-graphics images of thin and fat figures were generated by using a female template 
model in the Poser 8 software (SmithMicro Inc.). Each figure was dressed in a shirt with either 
vertical or horizontal stripes with a white-black ratio of 3:1 (Figure 2). The width of the black lines 
was constant (about 0.24 deg) for both body shapes. In each set, the figure with vertical stripes was 
used as a standard stimulus, and the body width of the figure with horizontal stripes was varied as a 
test stimulus by modifying the 3D parameters between the shoulder and the waist under Poser, from 
−10% to +10% of the standard stimulus in 5% steps. The shape of the horizontally striped figure was 
manipulated because it yielded less local distortion that could provide a cue than the vertically striped 
one. Each standard figure subtended approximately 17.8 deg vertically and 7.2 deg (thin) or 13.1 deg 
(fat) horizontally. The stimulus step was relatively coarse, because our preliminary observation 
suggested that the judgement was difficult with smaller differences. Given that the PSEs turned out to 
be less than 5%, this could have reduced the precision of the PSE estimates but should not have 
affected their accuracy on average. 
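
The angular sizes above can be related to physical sizes on the screen with the usual visual-angle formula, size = 2 d tan(θ/2). The following is a minimal sketch, assuming a flat screen viewed head-on at the 48 cm distance given in the participants paragraph; the function names are hypothetical and the paper does not report on-screen sizes in centimetres.

```python
import math

VIEWING_DISTANCE_CM = 48.0  # viewing distance reported in the Methods


def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen extent (cm) of a centred stimulus subtending angle_deg."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cm_to_visual_angle(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (deg) subtended by a centred stimulus of size_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


if __name__ == "__main__":
    # Angular sizes taken from the text: figure height, thin and fat
    # figure widths, and the width of a black stripe.
    for label, deg in [("height", 17.8), ("thin width", 7.2),
                       ("fat width", 13.1), ("black stripe", 0.24)]:
        print(f"{label}: {deg} deg -> {visual_angle_to_cm(deg):.2f} cm")
```

Under these assumptions the 17.8 deg figure height corresponds to roughly 15 cm on the screen.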

A method of constant stimuli was used to estimate the points of subjective equality (PSE) in 
perceived width of the figures between horizontal and vertical stripes. Superlab 4.5 on Windows (Cedrus 
Inc.) was used to control the experiment. Each trial started with a fixation mark. When the participant 
pressed a key, the fixation mark was replaced with a pair of figures that were presented for 1.8 s, shown 
side by side with a centre-to-centre distance of 13.1 deg. The participants were asked to judge which 
looked fatter and respond by pressing a key (2AFC). There was no time pressure for the response. The 
side of the standard stimulus was counterbalanced across trials. Each of the five test figures was 
presented 20 times on each side in a randomized order, resulting in 40 judgements per test size. Fat and 
thin figures were tested in separate blocks; each participant completed one block for each body type 
with a 30-s break between them, with the order counterbalanced across participants. The whole session 
took less than 25 min for each participant. Three or four participants were tested in a laboratory room 
at the same time, using separate computers and monitors. 
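
The trial structure of one block can be sketched as follows. This is an illustration only, not the Superlab script used in the experiment: the names are hypothetical, and a plain shuffle of the full trial list is assumed as the randomization scheme, which the text does not specify.

```python
import random
from collections import Counter

TEST_LEVELS_PCT = (-10, -5, 0, 5, 10)  # test width relative to the standard
TEST_SIDES = ("left", "right")         # side on which the test figure appears
REPS_PER_SIDE = 20                     # presentations per level and side


def build_block(rng):
    """Return a shuffled list of (level_pct, test_side) trials for one block."""
    trials = [
        (level, side)
        for level in TEST_LEVELS_PCT
        for side in TEST_SIDES
        for _ in range(REPS_PER_SIDE)
    ]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    block = build_block(random.Random(2013))  # arbitrary seed for repeatability
    per_level = Counter(level for level, _ in block)
    print(len(block), "trials;", dict(per_level))
```

With five levels, two sides, and 20 repetitions, a block contains 200 trials, i.e. 40 judgements per test size as stated above.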

PSE was calculated for each participant as the 50% point of the psychometric function that 
was estimated by probit analysis (Finney, 1971), using the glm() function of the R language. 
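
As an illustration of this step, the sketch below fits a probit psychometric function, P("test looks fatter") = Φ(b0 + b1·x), by Fisher scoring and takes the PSE as x = −b0/b1. It is a Python stand-in written for this note, not the R glm() call used in the study; the function names are hypothetical and the response counts in the example are invented for demonstration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: cdf is the probit inverse link


def fit_probit(levels, n_fatter, n_trials, max_iter=100, tol=1e-10):
    """Maximum-likelihood fit of P = Phi(b0 + b1 * x) by Fisher scoring.

    levels   : stimulus levels (here, % width change of the test figure)
    n_fatter : number of "test looks fatter" responses at each level
    n_trials : number of trials at each level
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, k, n in zip(levels, n_fatter, n_trials):
            eta = b0 + b1 * x
            # Clamp p away from 0 and 1 to keep the weights finite.
            p = min(max(_STD_NORMAL.cdf(eta), 1e-9), 1.0 - 1e-9)
            phi = _STD_NORMAL.pdf(eta)
            w = phi / (p * (1.0 - p))
            score = (k - n * p) * w  # contribution to the score vector
            info = n * phi * w       # contribution to the Fisher information
            u0 += score
            u1 += score * x
            i00 += info
            i01 += info * x
            i11 += info * x * x
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det  # solve the 2x2 system I * d = U
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def pse_from_fit(b0, b1):
    """Stimulus level at which the fitted function crosses 50%."""
    return -b0 / b1


if __name__ == "__main__":
    levels = [-10, -5, 0, 5, 10]
    n_fatter = [3, 9, 17, 28, 36]  # invented counts, not data from the study
    n_trials = [40] * 5            # 40 judgements per test size
    b0, b1 = fit_probit(levels, n_fatter, n_trials)
    print(f"PSE = {pse_from_fit(b0, b1):.2f}% of the standard width")
```

With the sign convention of the Results, a positive PSE from such a fit means that the horizontally striped figure had to be made wider to look as fat as the vertically striped standard.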